Abstract


Under dilute in vitro conditions transcription factors rapidly locate their target sequence on DNA by 
using the facilitated diffusion mechanism. However, whether this strategy of alternating between three- 
dimensional bulk diffusion and one-dimensional sliding along the DNA contour is still beneficial in the 
crowded interior of cells is highly disputed. Here we use a simple model for the bacterial genome inside the 
cell and present a semi-analytical model for the in vivo target search of transcription factors within the 
facilitated diffusion framework. Without having to resort to extensive simulations we determine the mean 
search time of a lac repressor in a living E. coli cell by including parameters deduced from experimental 
measurements. The results agree very well with experimental findings, and thus the facilitated diffusion 
picture emerges as a quantitative approach to gene regulation in living bacterial cells. Furthermore we see
that the search time is not very sensitive to the parameters characterizing the DNA configuration and 
that the cell seems to operate very close to optimal conditions for target localization. Local searches as 
implied by the colocalization mechanism are only found to mildly accelerate the mean search time within 
our model. 
Introduction


Transcription factors (TFs) are able to locate and bind their target sequence on DNA at surprisingly
high rates. This became clear in 1970, when the lac repressor was measured in vitro to associate
with the operator at a rate of $k_a = 7 \times 10^{9}\,\mathrm{M^{-1}\,s^{-1}}$ [1]. This is about two orders of
magnitude faster than the rate calculated with the well-known Smoluchowski formula for
three-dimensional diffusion control [2].
The results obtained in the in vitro experiments by Riggs et al. and by Winter et al. were successfully
explained with the by now classical facilitated diffusion model, introduced by Berg, von Hippel and
co-workers [Hill]: the TF alternates between three-dimensional diffusion through the bulk solution and
sliding along the DNA contour, which can be considered as one-dimensional diffusion. While a large
majority of subsequent reformulations of this target search problem are based on this facilitated diffusion
model [SHE], there are also critical reviews focusing on limitations of the traditional model [SUTO].

Although most scientists accept that TFs perform facilitated diffusion in vitro to find
their targets, there is a lively debate on whether this mechanism indeed plays a role in vivo. The interest
in this long-standing topic was boosted by the development of new experimental techniques, namely
single-molecule assays studying DNA-binding proteins or, more generally, the diffusion of proteins within
cells [TTMl8|. After finding indirect evidence some years ago, Elf and coworkers recently demonstrated
that the lac repressor does display facilitated diffusion in live Escherichia coli (E. coli) cells [19,20].

Thus it is important to study how the present facilitated diffusion models need to be translated
to the in vivo situation. In comparison to the dilute situation studied in vitro, the most important
changes are the influence of the confinement to the cell body or the nucleoid, the compactified DNA
conformation, and the impact of the presence of many large biomolecules. The latter, which is often
referred to as macromolecular crowding, has two major effects: the equilibrium for DNA-binding proteins
is shifted in favor of the associated state, and the diffusion in the cytoplasm is slowed down [21,22]. There is
an ongoing debate whether this reduced diffusion is still Brownian, following experimental evidence that
for larger molecules such as mRNA [23,24] or lipid granules [35] the motion follows the laws of anomalous
diffusion [26,27]. Indeed, there are indications that particles with a size of several tens of kilodaltons
exhibit anomalous diffusion [28,29]. In what follows we model TFs in the bulk by normal Brownian
diffusion and point at potential implications of anomalous diffusion in the conclusions.

We note that theoretical work on facilitated diffusion in vivo has also been reported by Mirny and
coworkers as well as by Koslover and coworkers [9,30]. A different approach for the situation in living
cells, based on a fractal organization of the chromatin in the nucleus, showed that also in eukaryotes
facilitated diffusion can be beneficial [31].

With respect to the impact of the cell's finite size, Foffano et al. recently studied the influence of
(an-)isotropic confinement on the facilitated diffusion process for rather short DNA chains [32]. To build
a theoretical model for facilitated diffusion on the entire genome in living cells we briefly review what
is known about the organization of the bacterial DNA [33]. The emerging general consensus points
at a distinct separation of the genome into connected subunits that may be dynamic. Using atomic
force microscopy the size of structural units of the E. coli chromosome was studied, finding units of size
40 nm and 80 nm [34]. By means of two complementary approaches the average size of the structural
domains was measured to be 10 kilobasepairs (kbp) [35]. Romantsov et al. studied the structure with
fluorescence correlation spectroscopy, yielding units of size 50 kbp with a diameter of (70 ± 20) nm [36].
Chromosome conformation capture carbon copy (5C) was used to determine a three-dimensional model
of the Caulobacter genome [37]. For the same bacterium Viollier et al. determined that the location of
genes on the chromosome map correlates linearly with their position along the cell's long axis [38].

Based on these experimental observations several models for the DNA structure in bacterial cells have
been proposed: entropy has been identified as the main driver of chromosome segregation, and ring polymers
are used to model the bacterial chromosome J321HQ]. Buenemann and Lenz showed that a geometric
model based on a self-avoiding random walk (SAW) is sufficient to explain the linear positioning of loci
along the cell's longest axis [41]. Finally, the chromosomal structure and, in particular, the accurate
positioning of loci was proposed as resulting from regulatory interactions [42,43].

In this paper we investigate whether it is possible to extend our previous generalized facilitated diffusion model
[44] to the in vivo situation and compare the results with the ones obtained by Koslover et al. [30].
To this end, in the following section we detail how we obtain a coarse-grained model for the bacterial
genome and state our semi-analytical model for the search process. Then the general theory is
applied to the specific case of a lac repressor in an E. coli cell, and we favorably compare our results with
related experimental measurements [19]. Finally we summarize our findings and give an outlook on future
research directions.

Theory 

The quantity we investigate is the average time a TF needs to find a target sequence in a living bacterial 
cell after starting at a random position within the cell. In principle it is possible to apply our previous 
generalized diffusion model using rescaled rates, lengths and diffusion constants to account for the crowded 
in vivo environment [44] . However, for parameters typical for the interior of cells the effective contact 
radius between TF and DNA is larger than the average distance between neighboring DNA segments. 
Consequently a direct translation is not possible. 

Moreover, as we will see below, already the simpler one-state model of facilitated diffusion is sufficient
to obtain a fairly good estimate of the experimental results without any further free parameters. Thus
we do not distinguish between search and recognition states of the TF-DNA complex [5]. Intersegmental
jumps and/or transfers [5j[Hl[IH][I5] of TFs between DNA segments that are close-by in the embedding
space but distant when measured in the chemical coordinate along the genome are to some extent
indirectly included in terms of re-attachment to the DNA within one of the geometric subunits of the
chromosome. In future studies these effects could be explicitly included to refine the model.
Our approach is based on the general picture of the facilitated diffusion mechanism: the TF diffuses
three-dimensionally through the bulk solution until it encounters a stretch of DNA to which it can bind. 
Then a sliding motion along the DNA contour is possible, during which the TF probes for the target. 
If the target is not found, the TF will dissociate from the chain after a certain time span and resume 
its 3D-diffusion through the cell until the next binding event. This scheme continues until the target 
is found. The major difference to the dilute in vitro situation lies in the DNA conformation which is 
heavily influenced by the confinement to the cell volume or the nucleoid volume: As the contour length 
of (the typically circular) bacterial DNA is about three orders of magnitude larger than the longest cell 
axis in which it is placed, there is clearly a need to compact it. To proceed we present our model for the 
compacted genome. 

Model for the compacted genome 

Without dwelling on the extent to which nucleoid-structuring proteins and/or supercoiling are responsible
for DNA compaction in bacterial cells, we adapt the model of Buenemann and Lenz and assume
that the DNA is assembled structurally into spheres ('blobs') containing one loop each [41]. Thus, the
whole genome is modeled as a closed SAW of these uniformly large blobs on a lattice representing the
nucleoid volume (here we diverge from ref. [41], where the full cell volume was taken). To mimic the
cylindrical shape of the nucleoid, one of the cuboid lattice's edges is taken to be longer than the other
two, which are of equal length.

The key quantities are the blobs' radius of gyration $r_g$ and the number of basepairs within a blob,
$N_b$. While the latter parameter determines how many blobs make up the DNA, since the number of bps
on the DNA is a fixed parameter, the former effectively determines the lattice size (see figure 1).




Figure 1. Two-dimensional schematic of the DNA conformation. The circles denote single
DNA blobs. The lattice spacing is twice the blob radius: $d_g = 2r_g$. A part of an exemplary search
trajectory is depicted by the arrow.

To obtain individual DNA conformations we follow a routine similar to the one described in ref. [41] : 
as a starting point we use a closed loop of minimal extension which touches both end faces along the 
longest cell axis. Then the chain is elongated by inserting hooks at random positions until it reaches 
the desired length (due to the form of the algorithm only chains with an even number of blobs are 
considered). Only elongation steps which yield a conformation within the nucleoid volume are executed. 
Afterwards the genome is equilibrated in the following manner: we randomly choose one of the three
transformation types of the MOS algorithm [46]. We then check whether the resulting conformation is still
a SAW; otherwise the old conformation is kept. Finally, only attempts are counted in which the SAW is
still confined to the nucleoid volume. This is repeated 100,000 times for each individual model genome.
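
The elongation step described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the choice of starting loop (up one lattice column and down the neighbouring one), the box convention and the retry limit are all assumptions made for illustration, and the subsequent MOS equilibration is not included.

```python
import random

def minimal_loop(lz):
    """Closed loop of minimal extension: up one lattice column, down its neighbour."""
    up = [(0, 0, z) for z in range(lz)]
    down = [(1, 0, z) for z in range(lz - 1, -1, -1)]
    return up + down

def try_insert_hook(chain, box, rng):
    """Attempt one hook insertion on a random bond; return True if the chain grew by two."""
    n = len(chain)
    i = rng.randrange(n)
    p, q = chain[i], chain[(i + 1) % n]
    bond_axis = next(k for k in range(3) if p[k] != q[k])
    axis = rng.choice([k for k in range(3) if k != bond_axis])
    step = rng.choice((-1, 1))
    shift = tuple(step if k == axis else 0 for k in range(3))
    new_p = tuple(c + s for c, s in zip(p, shift))
    new_q = tuple(c + s for c, s in zip(q, shift))
    occupied = set(chain)
    for site in (new_p, new_q):
        inside = all(0 <= c < m for c, m in zip(site, box))
        if not inside or site in occupied:
            return False  # move would leave the box or break self-avoidance
    chain[i + 1:i + 1] = [new_p, new_q]  # p -> new_p -> new_q -> q
    return True

def grow_loop(box, n_blobs, rng, max_tries=1_000_000):
    """Grow the minimal loop inside `box` (nx, ny, nz) to an even number n_blobs of sites."""
    if n_blobs % 2 or n_blobs < 2 * box[2]:
        raise ValueError("n_blobs must be even and at least 2 * box[2]")
    chain = minimal_loop(box[2])
    tries = 0
    while len(chain) < n_blobs:
        if tries >= max_tries:
            raise RuntimeError("could not reach the requested length")
        try_insert_hook(chain, box, rng)
        tries += 1
    return chain

def is_closed_saw(chain, box):
    """Check self-avoidance, confinement, and unit-length bonds including the closing bond."""
    if len(set(chain)) != len(chain):
        return False
    if not all(0 <= c < m for site in chain for c, m in zip(site, box)):
        return False
    n = len(chain)
    return all(sum(abs(a - b) for a, b in zip(chain[i], chain[(i + 1) % n])) == 1
               for i in range(n))
```

Under these assumptions, `grow_loop((7, 7, 46), 464, random.Random(0))` would produce one unequilibrated starting conformation for parameter set a. Because each hook adds two sites, only even chain lengths are reachable, which is the restriction mentioned in the text.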
Afterwards the resulting DNA conformations are centered on a larger lattice representing the full
cell volume and remain unchanged during the subsequent simulation of the target search process. This
approach is supported by recent results that DNA dynamics have only little effect on target search rates [30].
For the sake of simplicity we assign the target to be in a blob in the middle of the DNA.

Target search process 

The TF is assumed to start its search at a random position in the cell volume and its motion is modeled
as a random walk on the effective lattice (fig. 1), during which we keep track of how often sites containing
a blob are passed. The search process is schematically depicted in fig. 2.
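
A minimal sketch of such a lattice random walk is given below. It is an illustration under stated assumptions, not the simulation code of the paper: the names are hypothetical, moves that would leave the cell are rejected but still counted as steps, and an encounter is counted each time the walker moves onto a site occupied by a blob other than the target blob.

```python
import random

# The six nearest-neighbour moves on a simple cubic lattice.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def random_site(cell, rng):
    """Uniformly distributed start site in a cell lattice of size (nx, ny, nz)."""
    return tuple(rng.randrange(n) for n in cell)

def walk_to_target(start, target, blobs, cell, rng, max_steps=10**7):
    """Walk until the target site is reached.

    Returns (steps, encounters): the number of attempted lattice steps and the
    number of arrivals at blob sites other than the target.
    """
    pos = start
    steps = 0
    encounters = 0
    while pos != target:
        if steps >= max_steps:
            raise RuntimeError("target not reached within max_steps")
        move = rng.choice(NEIGHBOURS)
        nxt = tuple(p + m for p, m in zip(pos, move))
        steps += 1
        if all(0 <= c < n for c, n in zip(nxt, cell)):
            pos = nxt  # assumption: moves leaving the cell are rejected
            if pos in blobs and pos != target:
                encounters += 1
    return steps, encounters
```

Averaging `steps` and `encounters` over many walks started at `random_site(cell, rng)` gives estimates of the mean step number and the mean number of blob encounters before the target blob is reached. Starting instead on a site next to the target blob gives the corresponding quantities for returning trajectories.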




Figure 2. Schematic of the microscopic events within a blob (without target). B denotes a 
bound TF, and U an unbound TF within a blob. Finally, S represents a searching TF which is currently 
not in a blob. 

The TF starts its search diffusing in 3D (S-state). With certainty (probability 1) after some time
it will encounter a blob, which it enters in its unbound state (U). We first study the case where this
blob does not contain the target DNA. Based on the microscopic model outlined below, we assign a
probability $p_r$ that the TF will bind to the DNA within this blob. If so it changes to the B-state. As there
is no target to be found on the DNA, after some time the TF will dissociate and return to the unbound
U-state. With probability $p_r$ it can bind again, or it may leave the blob (with probability $1 - p_r$) and
start a new random walk on the lattice (S-state). The same procedure will take place when subsequent
blobs are encountered.

A qualitatively new event occurs when the site containing the target DNA is encountered for the first 
time. Now the tendency to quit the corresponding blob competes with the probability to find the target. 
For this reason, in general several encounters with the target blob are necessary. The corresponding
scheme is depicted in figure 3.




Figure 3. Schematic of the microscopic events within the target blob. Same notation as in 
the previous figure. Additionally, T denotes a TF which has found the target. 

Once again after entering the blob in the unbound U-state, with probability $1 - p_r$ not a single binding
event takes place. However, if the TF binds to DNA (with probability $p_r$), subsequently with probability
$p_t$ the target will be found (T-state) before dissociating. If the target is not found and the TF dissociates,
again with probability $1 - p_r$ the blob is left. Otherwise (with probability $p_r$) a new chance to find the
target while being bound opens up. As in the simpler scheme without target, a new random walk
(S-state) is started on a neighboring site if the blob is left. To proceed we relate the probabilities $p_r$ and
$p_t$ to microscopic quantities and determine the time steps of the individual processes, before calculating
the typical search time for the target.



Microscopic model 

To determine $p_r$, that is, the probability to bind to DNA after entering a blob or after dissociation
from the DNA within the blob, we employ the approximation that locally the DNA can be treated as a
random coil [3"ll44]. Thus we have to solve the diffusion equation for an initially homogeneous probability
distribution within a sphere of radius $r_g$. Inside this sphere nonspecific association to a basepair on the
DNA occurs with the constant, intrinsic rate $k_{\mathrm{ass}}$ (in units of $\mathrm{M^{-1}\,s^{-1}}$). We introduce a second concentric
sphere of radius $a r_g$ whose surface is absorbing, modeling the TFs leaving the domain of the blob. Thus,
the dimensionless quantity $a$ measures (in units of $r_g$) where the blob's area of influence ends, see below
and Supporting Information (SI) S1. The corresponding problem is solved in the SI S1, yielding the
binding probability

$$p_r = 1 - \frac{3a\,\phi(\gamma)}{a + (a - 1)\gamma^2\phi(\gamma)} \tag{1}$$

with the dimensionless quantity $\gamma = r_g\sqrt{\kappa/D_3}$. Here $D_3$ denotes the 3D-diffusion constant, and
$\kappa = k_{\mathrm{ass}} n$. Moreover, $n = 3N_b/(4\pi r_g^3)$ represents the density of DNA within the coil. In Eq. (1) we also
introduced the auxiliary function $\phi(\gamma) = (\gamma\coth(\gamma) - 1)/\gamma^2$ [47].

Note that $p_r$ is a monotonic function of $\gamma$. Keeping the values of $\kappa$, $a$ and $r_g$ fixed, for decreasing, yet
finite values of $D_3$ the probability to escape the blob (which is given by $1 - p_r$) becomes smaller, as in
this case the TF moves more slowly and spends more time within the blob, where it can be caught by a stretch
of DNA. Exactly at $D_3 = 0$ one obtains $p_r = 1$, an apparent paradox. However, while it is true that
an immobile TF is unable to leave a blob, the converse argument that the TF will bind to DNA with
certainty is not obvious, as binding requires the motion of a TF towards DNA within the blob. Because
this complementarity is implicitly assumed in the present model, it only yields meaningful results for
finite values of $\gamma$. Only this situation will be considered in the following.
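
The binding probability can be evaluated numerically with a few lines. The sketch below assumes the form $p_r = 1 - 3a\phi(\gamma)/(a + (a-1)\gamma^2\phi(\gamma))$ for Eq. (1), which is what the stated boundary-value problem yields when solved directly. The function names and the small-$\gamma$ series cutoff are illustrative choices.

```python
import math

def phi(gamma):
    """Auxiliary function phi(gamma) = (gamma * coth(gamma) - 1) / gamma**2."""
    if gamma < 1e-3:
        # Leading terms of the series expansion; avoids cancellation near gamma = 0.
        return 1.0 / 3.0 - gamma**2 / 45.0
    return (gamma / math.tanh(gamma) - 1.0) / gamma**2

def p_bind(gamma, a):
    """Probability p_r to bind within a blob before crossing the sphere of radius a * r_g."""
    f = phi(gamma)
    return 1.0 - 3.0 * a * f / (a + (a - 1.0) * gamma**2 * f)

def gamma_from_rates(r_g, kappa, d3):
    """gamma = r_g * sqrt(kappa / D_3), with kappa = k_ass * n in units of 1/s."""
    return r_g * math.sqrt(kappa / d3)
```

In this form $p_r$ vanishes for $\gamma \to 0$ (fast diffusion or weak association) and approaches one for $\gamma \to \infty$, in line with the limiting behaviour discussed in the text.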

If binding occurs, the average time this takes is given by a somewhat complicated formula for arbitrary
values of $a$ (see SI S1). Here we report the simpler result for the special case $a = 2$. This case is of
interest, as in the numerical evaluation we use the value $a = \sqrt{23/5} \approx 2.14$, see below.

$$\tau_b^{a=2} = \frac{1}{2\kappa}\,\frac{20 + (8\gamma^2 - 60)\phi(\gamma) + (4\gamma^2 - 36)\gamma^2\phi^2(\gamma)}{\left(2 + \gamma^2\phi(\gamma)\right)\left(2 + (\gamma^2 - 6)\phi(\gamma)\right)} \tag{2}$$

Conversely, the average time it takes the TF to leave the blob is

$$\tau_e^{a=2} = \frac{1}{2\kappa}\,\frac{6 - 2\phi^{-1}(\gamma) + \gamma^2\left(4\phi(\gamma) + \frac{4}{3}\right) + \frac{\gamma^4}{3}\phi(\gamma)}{2 + \gamma^2\phi(\gamma)}. \tag{3}$$
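
As a numerical cross-check of the two conditional times for $a = 2$, the following sketch evaluates them and compares the escape time with a known limit. The function names are hypothetical. The coefficient $-60$ in the numerator of the binding time is an assumption of this sketch: it is the value for which $\tau_b$ stays finite as $\gamma \to 0$, as required for a conditional time of order $r_g^2/D_3$.

```python
import math

def phi(gamma):
    """Auxiliary function phi(gamma) = (gamma * coth(gamma) - 1) / gamma**2."""
    if gamma < 1e-3:
        return 1.0 / 3.0 - gamma**2 / 45.0  # series expansion near gamma = 0
    return (gamma / math.tanh(gamma) - 1.0) / gamma**2

def tau_bind_a2(gamma, kappa):
    """Mean time to bind, given that binding occurs, for a = 2 (assumed form of Eq. 2)."""
    f, g2 = phi(gamma), gamma**2
    numerator = 20.0 + (8.0 * g2 - 60.0) * f + (4.0 * g2 - 36.0) * g2 * f**2
    denominator = (2.0 + g2 * f) * (2.0 + (g2 - 6.0) * f)
    return numerator / (2.0 * kappa * denominator)

def tau_escape_a2(gamma, kappa):
    """Mean time to leave the blob, given that no binding occurs, for a = 2 (Eq. 3)."""
    f, g2 = phi(gamma), gamma**2
    numerator = 6.0 - 2.0 / f + g2 * (4.0 * f + 4.0 / 3.0) + g2**2 * f / 3.0
    return numerator / (2.0 * kappa * (2.0 + g2 * f))

# Both expressions involve strong cancellation for very small gamma and are
# meant to be evaluated for gamma of roughly 0.01 or larger.
```

In units of $r_g^2/D_3 = \gamma^2/\kappa$, the escape time tends to $17/30$ for $\gamma \to 0$. This equals the mean first-passage time to a sphere of radius $2r_g$ for a particle that starts uniformly distributed inside the sphere of radius $r_g$ and diffuses freely.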

While diffusing in 3D, a single random walk step on average takes $\tau_{3D} = d_g^2/(6D_3)$. Once the TF binds
non-specifically to the blob containing the target, the probability to find the target before dissociating
can be found by considering a one-dimensional diffusion problem. We assume that the target is located
in the middle of the corresponding blob. Then we consider a DNA stretch of length $L = N_b b/2$ with the
target at one end. Here $b$ denotes the size of a basepair, $b = 0.34$ nm.

Due to the DNA's coiled conformation within a blob, we use the standard assumption that the first
binding event occurs at a random position on the DNA and that dissociation and reassociation positions
are completely uncorrelated, see for example [48]. Formally this implies that the TF initially is uniformly
distributed on the DNA, along which it diffuses with the diffusion constant $D_1$. The TF can leave the
DNA with the dissociation rate $k_{\mathrm{off}}$. We furthermore assume that the other extremity of the DNA acts
as a reflecting boundary |3H], possibly due to compacting proteins that obstruct further 1D-diffusion at
this position. The calculation detailed in the SI S1 yields:

$$p_t = \frac{\tanh(L/\ell)}{L/\ell}, \tag{4}$$

with $\ell = \sqrt{D_1/k_{\mathrm{off}}}$, which denotes a typical distance covered sliding on DNA before dissociating. If the
target is found, the conditional time this successful event takes on average reads

$$\tau_t = \frac{1 - \left(p_t\cosh^2(L/\ell)\right)^{-1}}{2k_{\mathrm{off}}} = \frac{1 - (L/\ell)\left(\sinh(L/\ell)\cosh(L/\ell)\right)^{-1}}{2k_{\mathrm{off}}}. \tag{5}$$

However, an unsuccessful event implies that the DNA is (on average) left after the time span $\tau_d = 1/k_{\mathrm{off}}$.
Inspection of Eq. (5) shows that in the limit $D_1 \to 0$, i.e. when TFs are (nearly) incapable of sliding,
$\tau_t$ approaches the finite value $1/(2k_{\mathrm{off}})$, which is at first sight a surprising result. However, in this limit
the probability to reach the target as given by Eq. (4) approaches zero, ensuring that meaningful results
are obtained. It should be stressed that our model only allows target detection via sliding, and not via
direct detection solely through three-dimensional diffusion.
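
The one-dimensional quantities can be evaluated with the short sketch below. It assumes the forms $p_t = \tanh(L/\ell)/(L/\ell)$ and $\tau_t = [1 - (L/\ell)/(\sinh(L/\ell)\cosh(L/\ell))]/(2k_{\mathrm{off}})$, which follow from the stated sliding problem (uniform start, target at one end, reflecting boundary at the other, dissociation with rate $k_{\mathrm{off}}$). Function names and the overflow guard are illustrative.

```python
import math

def sliding_length(d1, k_off):
    """Typical sliding distance ell = sqrt(D_1 / k_off)."""
    return math.sqrt(d1 / k_off)

def p_target(length, ell):
    """Probability p_t to reach the target before dissociating (Eq. 4)."""
    x = length / ell
    return math.tanh(x) / x

def tau_target(length, ell, k_off):
    """Mean duration of a successful sliding event (Eq. 5)."""
    x = length / ell
    # x / (sinh(x) * cosh(x)) equals 2x / sinh(2x); the term is negligible for
    # large x, where math.sinh would overflow.
    correction = 2.0 * x / math.sinh(2.0 * x) if x < 350.0 else 0.0
    return (1.0 - correction) / (2.0 * k_off)

# Example with values quoted in the text (SI units):
# N_b = 1e4, b = 0.34 nm, D_1 = 0.046 um^2/s, k_off = 200 / s.
L_example = 1e4 * 0.34e-9 / 2.0
ell_example = sliding_length(0.046e-12, 200.0)
```

For $L \gg \ell$ the target probability reduces to $\ell/L$, and for $L \ll \ell$ the conditional time tends to the pure first-passage result $L^2/(3D_1)$, which provides two simple sanity checks.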



Mean search time 

To determine the mean time it takes to find the target, we first specify how often the "loop" of binding
and unbinding events (B and U in figures 2 and 3) is executed during an encounter with a blob. In all
the blobs without the target this happens on average $p_r/(1 - p_r)$ times. As one loop lasts $\tau_c = \tau_b + \tau_d$,
the average time that is spent within a blob is $\tau_{\mathrm{blob}} = \tau_e + \tau_c\,p_r/(1 - p_r)$.

In the blob containing the target, the average number of binding and unbinding loops is $g(p_r, p_t) =
\chi/(1 - \chi)$, where $\chi = p_r(1 - p_t)$. Note that the number of executed loops in blobs without target is the
special case $p_t = 0$ of the general case, $g(p_r, p_t = 0) = p_r/(1 - p_r)$. In the same sense figure 2 can be
considered a special case of figure 3. The combined probability to find the target before leaving the blob
reads $p_r p_t/(1 - p_r + p_r p_t)$; consequently the probability for a failed attempt is $p_{\mathrm{uns}} = (1 - p_r)/(1 - \chi)$.
Thus, a successful event during which the target is found on average takes $\tau_{\mathrm{suc}} = \tau_b + \tau_t + g(p_r, p_t)\tau_c$,
and an unsuccessful one $\tau_{\mathrm{uns}} = \tau_e + g(p_r, p_t)\tau_c$.
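
The loop statistics can be checked with a small Monte Carlo sketch of the scheme in figure 3. The function names are hypothetical, and the simulation only tracks the sequence of states (U, B, T or leaving the blob), not the associated times.

```python
import random

def mean_loops(p_r, p_t):
    """Average number g of bind/unbind loops, g = chi / (1 - chi) with chi = p_r * (1 - p_t)."""
    chi = p_r * (1.0 - p_t)
    return chi / (1.0 - chi)

def p_unsuccessful(p_r, p_t):
    """Probability p_uns that a visit to the target blob ends without finding the target."""
    return (1.0 - p_r) / (1.0 - p_r * (1.0 - p_t))

def simulate_visit(p_r, p_t, rng):
    """Simulate one visit to the target blob; return (found, completed_loops)."""
    loops = 0
    while True:
        if rng.random() >= p_r:
            return False, loops   # TF leaves the blob from the unbound state U
        if rng.random() < p_t:
            return True, loops    # bound TF reaches the target (state T)
        loops += 1                # TF dissociates without success: one B -> U loop
```

Setting `p_t = 0` reproduces the scheme for blobs without target, for which `mean_loops` reduces to $p_r/(1 - p_r)$ and every visit is unsuccessful.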

The mean total search time can be dissected into three contributions: first, the mean time the TF 
needs to arrive at the target blob for the first time. Then the mean time it takes to return to the target 
after an unsuccessful search event. The latter has to be multiplied with the average number of failed 
attempts. Finally the average time it takes to successfully bind the target at the corresponding blob has 
to be added. 

To quantify this model two parameter pairs from the random walk simulation are needed as inputs:
the mean number of steps it takes to encounter the target blob for the first time, $n_{f,3D}$, after starting
at a random position within the cell, and how many blobs without target are encountered during this
time, $n_{f,\mathrm{enc}}$. Furthermore we determine the mean number of steps and blob encounters in a random walk
starting on a site next to the target blob and ending in the target blob: $n_{r,3D}$ and $n_{r,\mathrm{enc}}$. Altogether the
mean total search time reads:

$$\begin{aligned}
\tau ={}& n_{f,3D}\,\tau_{3D} + n_{f,\mathrm{enc}}\,\tau_{\mathrm{blob}} \\
&+ \frac{p_{\mathrm{uns}}}{1 - p_{\mathrm{uns}}}\left(\tau_{\mathrm{uns}} + n_{r,3D}\,\tau_{3D} + n_{r,\mathrm{enc}}\,\tau_{\mathrm{blob}}\right) \\
&+ \tau_{\mathrm{suc}}.
\end{aligned} \tag{6}$$

This formula is the main result of our study, which will be discussed quantitatively for the case of the lac 
repressor in an E. coli cell. 
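
The structure of the mean search time can be written as a short function. This is a sketch with hypothetical argument names. It assumes that the weight of the return term is the mean number of failed attempts, $p_{\mathrm{uns}}/(1 - p_{\mathrm{uns}})$, as follows from the geometric statistics of repeated visits, and it takes all time scales and step counts as precomputed inputs.

```python
def mean_search_time(n_f3d, n_fenc, n_r3d, n_renc,
                     tau_3d, tau_blob, tau_uns, tau_suc, p_uns):
    """Mean total search time assembled from its three contributions (Eq. 6)."""
    # First arrival at the target blob from a random start position.
    first_arrival = n_f3d * tau_3d + n_fenc * tau_blob
    # One failed attempt in the target blob followed by the return to it.
    failed_round = tau_uns + n_r3d * tau_3d + n_renc * tau_blob
    # Mean number of failed attempts before the successful one.
    n_failed = p_uns / (1.0 - p_uns)
    return first_arrival + n_failed * failed_round + tau_suc
```

The step counts come from the random walk simulation, while the time scales and `p_uns` follow from the microscopic model, so that no further simulation is needed when the rates are varied.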
Results

As input parameters for our TF search model in a living cell we use data deduced from experimental
studies. For the DNA configuration we use two parameter sets for the blob size and the number $N_b$ of
basepairs within a blob: (a) $r_g = 15$ nm and $N_b = 10^4$ [35,41] and (b) $r_g = 35$ nm and $N_b = 5 \times 10^4$ [55].
The volume of the nucleoid can be approximated as a cylinder of diameter $d_{\mathrm{nuc}} = 0.24\,\mu\mathrm{m}$ and length
$l_{\mathrm{nuc}} = 1.39\,\mu\mathrm{m}$ [39]. We use a cuboid with edge lengths $l_x = l_y = \sqrt{\pi d_{\mathrm{nuc}}^2/4} \approx 213$ nm and $l_z = l_{\mathrm{nuc}}$.
This corresponds to nucleoid lattices of size 7 × 7 × 46 and 3 × 3 × 20. As the E. coli genome consists of
approximately 4639 kbp, we compose a closed SAW consisting of (a) 464 blobs and (b) 92 blobs, respectively. For the
two parameter sets we create three and five sample conformations, respectively. The total cell volume can be approximated
as a cylinder with diameter $d_{\mathrm{cell}} = 0.5\,\mu\mathrm{m}$ and length $l_{\mathrm{cell}} = 2.5\,\mu\mathrm{m}$ [39]. Accordingly, we use embracing lattices
of size 15 × 15 × 83 and 6 × 6 × 36 to mimic the full cell volume. In addition, we employ $a = \sqrt{23/5}$ in order
to obtain the correct asymptotic behavior for small values of $k_{\mathrm{ass}}$, as detailed in the SI S1, and we use
$D_3 = 3\,\mu\mathrm{m}^2/\mathrm{s}$ and $D_1 = 0.046\,\mu\mathrm{m}^2/\mathrm{s}$ [19]. The results of the random walk simulation are summarized in
table 1.

Table 1. Simulation results

Set   n_{f,3D}   n_{f,enc}   q_f      n_{r,3D}   n_{r,enc}   q_r
a     31514      766.41      0.0243   18689      463.48      0.0248
b     2594.7     175.63      0.0677   1291.9     90.848      0.0703

Simulation results for parameter sets a and b.



A first inspection of the values of $n_{r/f,3D}$ and $n_{r/f,\mathrm{enc}}$ shows that the ones obtained with parameter
set a are approximately one order of magnitude larger than the ones obtained with set b. This is expected, as
set a corresponds to a finer model of the DNA, in which the respective value of $r_g$ is smaller. Next, we
consider the ratios $q_f = n_{f,\mathrm{enc}}/n_{f,3D}$ and $q_r = n_{r,\mathrm{enc}}/n_{r,3D}$, that is, the fractions of sites containing a blob
encountered during a trajectory. The results are very close to the total fraction of sites that are occupied
by a blob: for parameter set a this is 464/(15 × 15 × 83) ≈ 0.0248, and for b it is 92/(6 × 6 × 36) ≈ 0.0710.
This and the fact that the values for the first encounter and for the returning trajectories are similar
support the statement that the TF experiences an effective medium through which it diffuses [30]. If
we only consider the mean search times, this medium is mainly characterized by the mean DNA density
within the cell.
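
The lattice sizes, blob numbers and occupancy fractions quoted above can be reproduced with a few lines. The sketch below uses hypothetical function names and assumes that lattice dimensions are obtained by rounding to the nearest integer and blob numbers by rounding to the nearest even integer. The text does not state the rounding rule, but this choice reproduces the quoted values.

```python
import math

def lattice_dims(diameter_nm, length_nm, r_g_nm):
    """Lattice size for a cylinder mapped to a cuboid with equal cross-sectional area."""
    d_g = 2.0 * r_g_nm                                 # lattice spacing d_g = 2 r_g
    edge = math.sqrt(math.pi * diameter_nm**2 / 4.0)   # square with the circle's area
    return (round(edge / d_g), round(edge / d_g), round(length_nm / d_g))

def n_blobs(genome_bp, bp_per_blob):
    """Number of blobs, rounded to the nearest even integer (assumption)."""
    return 2 * round(genome_bp / bp_per_blob / 2.0)

def occupancy(blobs, dims):
    """Fraction of lattice sites occupied by a blob."""
    return blobs / (dims[0] * dims[1] * dims[2])

def tau_3d(r_g_nm, d3_um2_per_s):
    """Mean duration of one lattice step, tau_3D = d_g**2 / (6 D_3), in seconds."""
    d_g_um = 2.0 * r_g_nm * 1e-3
    return d_g_um**2 / (6.0 * d3_um2_per_s)

# Parameter set a: r_g = 15 nm, N_b = 1e4 bp, genome of 4639 kbp.
nucleoid_a = lattice_dims(240.0, 1390.0, 15.0)
cell_a = lattice_dims(500.0, 2500.0, 15.0)
blobs_a = n_blobs(4_639_000, 10_000)
```

For set a the occupied fraction `occupancy(blobs_a, cell_a)` can be compared directly with the ratios $q_f$ and $q_r$ in table 1, and `tau_3d(15.0, 3.0)` gives the duration of a single lattice step for $D_3 = 3\,\mu\mathrm{m}^2/\mathrm{s}$.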

Non-monotonic behavior 

In figure 4 the mean search time averaged over the ensembles with parameter set a is shown as a function
of the association rate $k_{\mathrm{ass}}$ and the dissociation rate $k_{\mathrm{off}}$.

We find a non-monotonic dependence both on the association and the dissociation rate typical for
facilitated diffusion models: for a fixed value of $k_{\mathrm{ass}}$ there exists a value of $k_{\mathrm{off}}$ that minimizes the search
time. This minimal value decreases if both rates are increased while keeping them at a constant ratio.

In figure 5 the ratio of the search time obtained with parameter set b to the search time obtained
with parameter set a is plotted for the same range as in figure 4.

Even though set b always yields slightly smaller search times, the results are very similar, especially
in the range usually studied in experiments, as we will see below. Therefore in the following we solely
consider results obtained with set a. In the SI S1 we moreover show that the approach to use an
ensemble average to obtain the mean search time is justified, as the scatter between data obtained with
individual conformations is negligible (see figure S1). Only at very low values of $k_{\mathrm{off}}$, when the TF spends
considerable time in the non-specifically bound state, does the individual conformation play a role.

Figure 4. Mean search time. The mean search time is plotted as a function of the dissociation rate
$k_{\mathrm{off}}$ and the association rate $k_{\mathrm{ass}}$ (using parameter set a). The blue bar and the blue dotted lines denote
the range of $k_{\mathrm{off}}$ which is biologically relevant [T5].

Figure 5. Difference between the two parameter sets. The plot shows the ratio of the mean
search time obtained with parameter set b to the one obtained with set a.

We saw that for fixed values of $k_{\mathrm{ass}}$ there exists an optimal value of $k_{\mathrm{off}}$, for which the target localization
occurs fastest. It is insightful to study whether a living E. coli cell operates close to this point.

Comparison to experimental results 

We choose the rates according to the results of Xie and coworkers [T5]: they measured that the lac
repressor spends 87% of the total time non-specifically bound and determined the residence time on
DNA $t_R$ to be in the range

$$0.3\,\mathrm{ms} < t_R = 1/k_{\mathrm{off}} < 5\,\mathrm{ms}. \tag{7}$$

To incorporate these values, we calculate the fraction of time, $f_b$, that the TF spends non-specifically
bound. This is obtained from Eq. (6) by only considering the terms involving $\tau_d$ and $\tau_t$. The result is
plotted in figure 6, again as a function of dissociation and association rate.





Figure 6. Bound fraction of time. The fraction of time during which the TF is non-specifically 
bound is shown (using parameter set a). 

We see that contour lines of a constant fraction appear as straight lines in this log-log plot. A
numerical analysis yields that the condition $f_b = 0.87$ is fulfilled for

$$\log_{10}\left(k_{\mathrm{ass}}\,(\mathrm{M^{-1}\,s^{-1}})\right) = 1.04\,\log_{10}\left(k_{\mathrm{off}}\,(\mathrm{s^{-1}})\right) + 2.76. \tag{8}$$

The observation that the slope of this curve is (nearly) unity reflects the fact that specifying the bound
fraction of time is equivalent to specifying the equilibrium binding constant, which is simply given by the
ratio of $k_{\mathrm{ass}}$ and $k_{\mathrm{off}}$. We plug Eq. (8) into our model and plot the resulting mean search time as a function
of the single residual parameter $k_{\mathrm{off}}$ in figure 7 in the range given by Eq. (7). Additionally, in figure 7 we
plot the minimal search time in this regime, which is obtained by choosing the optimal value of $k_{\mathrm{ass}}$.

In both cases we obtain a monotonically decreasing function of $k_{\mathrm{off}}$. Most interestingly, the values
obtained in this biologically relevant parameter regime are only marginally larger than the optimal ones.
At $k_{\mathrm{off}} > 500\,\mathrm{s^{-1}}$ the two data sets nearly lie on top of each other. This means that within our model
an E. coli cell seems to operate quite close to conditions which are optimal for target localization. At
$k_{\mathrm{off}} = 200\,\mathrm{s^{-1}}$, which was used in the discussion in ref. [30], we obtain $\tau \approx 311$ s. This is approximately
12% below the experimental result 6 × 59 s = 354 s [19], implying a very favorable agreement.
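
The calibration and the quoted deviation are simple enough to check by hand or with the following sketch. The function name is hypothetical, and the fit coefficients 1.04 and 2.76 are taken from Eq. (8) as given in the text.

```python
import math

def k_ass_from_k_off(k_off):
    """Association rate (1/(M s)) that gives a bound fraction of 0.87, following Eq. (8)."""
    return 10.0 ** (1.04 * math.log10(k_off) + 2.76)

# Association rate at the dissociation rate used in the comparison with ref. [30].
k_ass_200 = k_ass_from_k_off(200.0)

# Relative deviation of the model result from the experimental value 6 * 59 s.
tau_model = 311.0
tau_experiment = 6 * 59.0
relative_deviation = (tau_experiment - tau_model) / tau_experiment
```

With these numbers the model value lies about 12% below the experimental one, and the range of Eq. (7) corresponds to dissociation rates between 200 and roughly 3300 per second.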

Local searches 

There is some evidence that many TFs are produced close to their target positions, a phenomenon called
colocalization [T4j[49]. These local searches would obviously be faster than a global search starting at
a random position within the cell. To quantify this, in figure 8 we plot what percentage of the total
search time is still needed to find the target if the TF starts its search in the target blob while all other
parameters remain unchanged.

In mathematical terms this corresponds to omitting the terms in the first line of Eq. (6). We see that
only for relatively large values of $k_{\mathrm{ass}}$ an appreciable acceleration is obtained for local searches. This is
plausible, as large values of the association rate imply that all the blobs encountered en route act as traps
slowing down the transport. Interestingly, in the regime typical for the interior of cells the acceleration
is small. This can also be interpreted in the more general context of "geometry-controlled
kinetics", see the works of Benichou and coworkers [50,51]. These authors showed that for non-compact
exploration of space - as is the case in the present model - the initial position of a searching particle has
little influence.

Figure 7. Mean search time and minimal search time. The mean search time and the minimal
search time (with appropriately chosen $k_{\mathrm{ass}}$) are plotted as a function of the dissociation rate at
parameters relevant for the interior of living cells.

Figure 8. Acceleration due to local searches. The ratio of the time needed in a local search to
the one in a global search (with parameter set a) is shown.

Discussion 

We analyzed the facilitated diffusion mechanism in a living cell using a coarse-grained model of the 
bacterial genome. Just like in dilute in vitro systems there is a non-monotonic dependence both on the 
dissociation rate and the association rate of TFs from and to DNA. The respective optimal conditions 
mark a trade-off between spending too much time on DNA where the motion is rather slow, but the 
target can be found, and spending too much time in the cytoplasm where the motion is faster, but the 
TF is insensitive to the target. 
When calculating the mean search time, the only inputs we use from our random walk simulation are
the mean number of steps taken and the number of blobs encountered during the trajectory. This
corresponds to treating the nucleoid body as an effective medium through which the TF diffuses, which 
agrees with the observations made by Koslover et al. that within a short time span the TF starts an 
effective diffusive motion [30] . Accordingly, we see that the exact values of the parameters describing the 
DNA conformation have only little effect on our results. Only the fact that there is an effective medium 
characterized by the DNA density matters. 

Calibrating our results with the experimental observation that the TF spends 87% of the time 
non-specifically bound [19], we obtain search times that only slightly underestimate the experimentally known 
results. In a previous study we showed that the introduction of a search and a recognition state in order 
to resolve the speed-stability paradox slows down the search [33]. Thus, a refined model taking this effect 
into account could yield a result even closer to the experimental one. 

Most importantly, within our model the results in the biologically relevant regime of dissociation rates 
are quite close to the ones minimizing the search time, indicating that living E. coli cells function near 
conditions optimal for TF target location. 

Our results for the mean search times are similar to those obtained by Koslover et al. [30]. However, 
in their model for in vivo facilitated diffusion they distribute the DNA over the entire cell volume and 
assume a random coil configuration. If one were to confine the DNA to the smaller nucleoid volume, the 
effective DNA-TF contact radius in that model would then become smaller than the average distance 
between DNA segments. Besides, our model is less idealized. In that sense our current approach has the 
advantage that the DNA is realistically confined to the nucleoid volume, and based on input parameters 
deduced from experimental studies we also obtain mean search times that are very close to experimental 
in vivo values. Moreover, our model offers the advantage that in future studies additional information 
may be deduced, for example, by studying the underlying probability densities of the quantities entering 
the search time, in addition to their mean values determined here. 

Colocalization effects 

Comparing the mean search times for TFs starting at a random position in the cell volume with those 
for TFs that already start close to the target, we only observe a minor acceleration. This is due to the fact 
that most of the search time is spent returning to the target blob after a failed attempt to find the target. 
For a wide range of parameters the first encounter with the target blob only represents a small fraction 
of the whole search time. Going beyond the mean search time of an ensemble of TFs, on the level of 
single trajectories immediate returns to the target blob are indeed possible and thus may 
lead to search times much shorter than the average search time. Such scenarios may in fact be relevant 
for biological cells. 

Should observations of anomalous diffusion for TFs in the cytoplasm of living cells be substantiated, 
the effect of colocalization should become significantly more pronounced, if the nature of the exploration 
of space is compact [50,51]: subdiffusion implies an increased occupation probability near the initial 
position [23,52,53], and thus increases the likelihood for successful TF-DNA binding after repeated 
attempts. In that sense subdiffusion may even be beneficial for molecular processes in living cells, as argued 
recently [52,54,55]. 

We believe that this relatively simple model for facilitated diffusion in vivo will instigate new 
experiments and more detailed theories, to ultimately obtain a full understanding of bacterial gene regulation. 






Acknowledgments 

This work was supported by the Academy of Finland (FiDiPro scheme): www.aka.fi/eng; and the German Federal 
Ministry for Education and Research: www.bmbf.de/en/index.php. 

References 

1. Riggs AD, Bourgeois S, Cohn M (1970) The lac repressor-operator interaction: III. kinetic studies. 
J Mol Biol 53: 401-417. 

2. von Smoluchowski M (1916) Three presentations on diffusion, molecular movement according to 
brown and coagulation of colloid particles. Physikal Zeitschr 17: 557-571. 

3. Berg OG, Winter RB, Von Hippel PH (1981) Diffusion-driven mechanisms of protein translocation 
on nucleic acids. 1. models and theory. Biochemistry 20: 6929-6948. 

4. Winter RB, Berg OG, Von Hippel PH (1981) Diffusion-driven mechanisms of protein translocation 
on nucleic acids. 3. the escherichia coli lac repressor-operator interaction: kinetic measurements 
and conclusions. Biochemistry 20: 6961-6977. 

5. Slutsky M, Mirny L (2004) Kinetics of protein-dna interaction: Facilitated target location in 
sequence-dependent potential. Biophys J 87: 4021-4035. 

6. Lomholt MA, van den Broek B, Kalisch SMJ, Wuite GJL, Metzler R (2009) Facilitated diffusion 
with dna coiling. Proc Natl Acad Sci USA 106: 8204-8208. 

7. Zhou HX (2011) Rapid search for specific sites on dna through conformational switch of nonspecifically 
bound proteins. Proc Natl Acad Sci USA 108: 8651-8656. 

8. Sheinman M, Benichou O, Kafri Y, Voituriez R (2012) Classes of fast and specific search mechanisms 
for proteins on dna. Rep Prog Phys 75: 026601. 

9. Mirny L, Slutsky M, Wunderlich Z, Tafvizi A, Leith J et al. (2009) How a protein searches for its 
site on DNA: the mechanism of facilitated diffusion. J Phys A Math Gen 42: 434013. 

10. Kolomeisky AB (2011) Physics of protein-DNA interactions: mechanisms of facilitated target 
search. Phys Chem Chem Phys 13: 2088-2095. 

11. Sokolov I, Metzler R, Pant K, Williams M (2005) Target search of n sliding proteins on a dna. 
Biophys J 89: 895-902. 

12. Gowers DM, Wilson GG, Halford SE (2005) Measurement of the contributions of 1d and 3d pathways 
to the translocation of a protein along dna. Proc Natl Acad Sci USA 102: 15883-15888. 

13. Wang YM, Austin RH, Cox EC (2006) Single molecule measurements of repressor protein 1d 
diffusion on dna. Phys Rev Lett 97: 048302. 

14. Kolesov G, Wunderlich Z, Laikova ON, Gelfand MS, Mirny LA (2007) How gene order is influenced 
by the biophysics of transcription regulation. Proc Natl Acad Sci USA 104: 13948-13953. 

15. Bonnet I, Biebricher A, Porte PL, Loverdo C, Benichou O, et al. (2008) Sliding and jumping of 
single ecorv restriction enzymes on non-cognate dna. Nucleic Acids Res 36: 4118-4127. 

16. van den Broek B, Lomholt MA, Kalisch SMJ, Metzler R, Wuite GJL (2008) How dna coiling 
enhances target localization by proteins. Proc Natl Acad Sci USA 105: 15738-15742. 






17. Konopka MC, Shkel IA, Cayley S, Record MT, Weisshaar JC (2006) Crowding and confinement 
effects on protein diffusion in vivo. J Bacteriol 188: 6115-6123. 

18. Kühn T, Ihalainen TO, Hyväluoma J, Dross N, Willman SF, et al. (2011) Protein diffusion in 
mammalian cell cytoplasm. PLoS One 6: e22962. 

19. Elf J, Li GW, Xie XS (2007) Probing transcription factor dynamics at the single-molecule level in 
a living cell. Science 316: 1191-1194. 

20. Hammar P, Leroy P, Mahmutovic A, Marklund EG, Berg OG, et al. (2012) The lac repressor 
displays facilitated diffusion in living cells. Science 336: 1595-1598. 

21. Minton AP (2001) The influence of macromolecular crowding and macromolecular confinement on 
biochemical reactions in physiological media. J Biol Chem 276: 10577-10580. 

22. Morelli MJ, Allen RJ, ten Wolde PR (2011) Effects of macromolecular crowding on genetic networks. 
Biophys J 101: 2882-2891. 

23. Golding I, Cox EC (2006) Physical nature of bacterial cytoplasm. Phys Rev Lett 96: 098102. 

24. Weber SC, Spakowitz AJ, Theriot JA (2010) Bacterial chromosomal loci move subdiffusively 
through a viscoelastic cytoplasm. Phys Rev Lett 104: 238102. 

25. Jeon JH, Tejedor V, Burov S, Barkai E, Selhuber-Unkel C, et al. (2011) In Vivo anomalous diffusion 
and weak ergodicity breaking of lipid granules. Phys Rev Lett 106: 048103. 

26. Metzler R, Klafter J (2000) The random walk's guide to anomalous diffusion: a fractional dynamics 
approach. Phys Rep 339: 1-77. 

27. Barkai E, Garini Y, Metzler R (2012) Strange kinetics of single molecules in living cells. Phys 
Today 65(8): 29-35. 

28. Banks D, Fradin C (2005) Anomalous diffusion of proteins due to molecular crowding. Biophys J 
89: 2960-2971. 

29. Weiss M, Eisner M, Kartberg F, Nilsson T (2004) Anomalous subdiffusion is a measure for cytoplasmic 
crowding in living cells. Biophys J 87: 3518-3524. 

30. Koslover EF, Diaz de la Rosa MA, Spakowitz AJ (2011) Theoretical and computational modeling 
of target-site search kinetics in vitro and in vivo. Biophys J 101: 856-865. 

31. Benichou O, Chevalier C, Meyer B, Voituriez R (2011) Facilitated diffusion of proteins on chromatin. 
Phys Rev Lett 106: 038102. 

32. Foffano G, Marenduzzo D, Orlandini E (2012) Facilitated diffusion on confined dna. Phys Rev E 
Stat Nonlin Soft Matter Phys 85: 021919. 

33. Rocha EPC (2008) The organization of the bacterial genome. Annu Rev Genet 42: 211-233. 

34. Kim J, Yoshimura SH, Hizume K, Ohniwa RL, Ishihama A, et al. (2004) Fundamental structural 
units of the escherichia coli nucleoid revealed by atomic force microscopy. Nucleic Acids Res 32: 
1982-1992. 

35. Postow L, Hardy C, Arsuaga J, Cozzarelli N (2004) Topological domain structure of the escherichia 
coli chromosome. Genes Dev 18: 1766-1779. 






36. Romantsov T, Fishov I, Krichevsky O (2007) Internal structure and dynamics of isolated escherichia 
coli nucleoids assessed by fluorescence correlation spectroscopy. Biophys J 92: 2875-2884. 

37. Umbarger MA, Toro E, Wright MA, Porreca GJ, Bau D, et al. (2011) The three-dimensional 
architecture of a bacterial genome and its alteration by genetic perturbation. Mol Cell 44: 252-264. 

38. Viollier PH, Thanbichler M, McGrath PT, West L, Meewan M, et al. (2004) Rapid and sequential 
movement of individual chromosomal loci to specific subcellular locations during bacterial dna 
replication. Proc Natl Acad Sci USA 101: 9257-9262. 

39. Jun S, Wright A (2010) Entropy as the driver of chromosome segregation. Nat Rev Microbiol 8: 
600-607. 

40. Jung Y, Jeon C, Kim J, Jeong H, Jun S, et al. (2012) Ring polymers as model bacterial chromosomes: 
confinement, chain topology, single chain statistics, and how they interact. Soft Matter 8: 
2095-2102. 

41. Buenemann M, Lenz P (2010) A geometrical model for dna organization in bacteria. PLoS One 5: 
e13806. 

42. Junier I, Martin O, Kepes F (2010) Spatial and topological organization of dna chains induced by 
gene co-localization. PLoS Comput Biol 6: e1000678. 

43. Fritsche M, Li S, Heermann DW, Wiggins PA (2012) A model for escherichia coli chromosome 
packaging supports transcription factor-induced dna domain formation. Nucleic Acids Res 40: 
972-980. 

44. Bauer M, Metzler R (2012) Generalized facilitated diffusion model for dna-binding proteins with 
search and recognition states. Biophys J 102: 2321-2330. 

45. Sheinman M, Kafri Y (2009) The effects of intersegmental transfers on target location by proteins. 
Phys Biol 6: 016003. 

46. Madras N, Orlitsky A, Shepp L (1990) Monte carlo generation of self-avoiding walks with fixed 
endpoints and fixed length. J Stat Phys 58: 159-183. 

47. Reingruber J, Holcman D (2010) Narrow escape for a stochastically gated brownian ligand. J Phys 
Condens Matter 22: 065103. 

48. Coppey M, Benichou O, Voituriez R, Moreau M (2004) Kinetics of target site localization of a 
protein in DNA: a stochastic approach. Biophys J 87: 1640-1649. 

49. Wunderlich Z, Mirny LA (2008) Spatial effects on the speed and reliability of protein-dna search. 
Nucleic Acids Res 36: 3570-3578. 

50. Benichou O, Chevalier C, Klafter J, Meyer B, Voituriez R (2010) Geometry-controlled kinetics. 
Nat Chem 2: 472-477. 

51. Meyer B, Chevalier C, Voituriez R, Benichou O (2010) Universality classes of first-passage-time 
distribution in confined media. Phys Rev E Stat Nonlin Soft Matter Phys 83: 051116. 

52. Guigas G, Weiss M (2008) Sampling the cell with anomalous diffusion - The discovery of slowness. 
Biophys J 94: 90-94. 






53. Lomholt MA, Zaid IM, Metzler R (2007) Subdiffusion and weak ergodicity breaking in the presence 
of a reactive boundary. Phys Rev Lett 98: 200603. 

54. Hellmann M, Heermann DW, Weiss M (2012) Enhancing phosphorylation cascades by anomalous 
diffusion. EPL 97: 58004. 

55. Sereshki LE, Lomholt MA, Metzler R (2012) A solution to the subdiffusion-efficiency paradox: 
inactive states enhance reaction efficiency at subdiffusion conditions in living cells. EPL 97: 20008. 






Supporting Information S1 

In this supporting information we detail the explicit calculations which are beyond the scope of the main 
text. 

1 Microscopic model 
1.1 Association probability 

To relate $p_r$ to the non-specific association rate $k_{\mathrm{ass}}$ per base pair (in units of $\mathrm{M^{-1}\,s^{-1}}$), we solve the 
following diffusion equation for the TF's probability $c(r,t)$ to be at position $r$ at time $t$: 

$$\frac{\partial c(r,t)}{\partial t} = \begin{cases} D_3 \Delta c(r,t) - \kappa\, c(r,t), & \text{for } 0 < r < r_g \\ D_3 \Delta c(r,t), & \text{for } r_g < r < r_2 \end{cases} \tag{S1}$$

with $\kappa = n k_{\mathrm{ass}} N_b$, where $n$ denotes the density of DNA and $N_b$ the number of base pairs within the blob. 
$D_3$ denotes the 3D diffusion constant and $r_g$ the blob's radius of gyration. The differential equation is 
subject to the initial condition 

$$c(r,t=0) = \begin{cases} c_0 = 3/(4\pi r_g^3), & \text{for } 0 < r < r_g \\ 0, & \text{for } r_g < r < r_2 \end{cases} \tag{S2}$$

and the boundary condition $c(r = r_2, t) = 0$. Thus, $r_2$ represents a cutoff radius at which the TF is 
assumed to have definitely left the domain of the blob. We use $n = c_0$ as we study the situation where 
one TF is in the blob containing one DNA chain. 

We define the Laplace transform $f(u)$ of a function $f(t)$ through: 

$$f(u) = \int_0^\infty f(t) \exp(-ut)\,dt. \tag{S3}$$



In Laplace space the differential equation (S1) reads: 

$$u\,c(r,u) = \begin{cases} c_0 + D_3 \Delta c(r,u) - \kappa\, c(r,u), & \text{for } 0 < r < r_g \\ D_3 \Delta c(r,u), & \text{for } r_g < r < r_2 \end{cases} \tag{S4}$$



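From its solution the flux out of the outer sphere $j_{\mathrm{out}}(u)$ and the binding flux $j_{\mathrm{bind}}(u)$ in the inner sphere 
can be obtained via: 

$$j_{\mathrm{out}}(u) = -4\pi r_2^2 D_3 \left.\frac{\partial c(r,u)}{\partial r}\right|_{r=r_2} \tag{S5}$$

and 

$$j_{\mathrm{bind}}(u) = 4\pi\kappa \int_0^{r_g} dr\, r^2 c(r,u). \tag{S6}$$

We obtain 

$$j_{\mathrm{out}}(u) = \frac{3 r_2 q_2 \left(q_1 r_g \coth(q_1 r_g) - 1\right)}{r_g^3 q_1^3 \left[\sinh(q_2 \delta r)\coth(q_1 r_g) + \frac{q_2}{q_1}\cosh(q_2 \delta r)\right]} \tag{S7}$$

and furthermore 

$$j_{\mathrm{bind}}(u) = \frac{\kappa}{u+\kappa}\left[1 - \frac{3}{r_g^3 q_1^3}\,\frac{\left(q_1 r_g \coth(q_1 r_g) - 1\right)\left(1 + r_g q_2 \coth(q_2 \delta r)\right)}{\coth(q_1 r_g) + \frac{q_2}{q_1}\coth(q_2 \delta r)}\right], \tag{S8}$$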
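where $q_1 = \sqrt{(u+\kappa)/D_3}$, $q_2 = \sqrt{u/D_3}$ and $\delta r = r_2 - r_g$. 
A Taylor series around $u = 0$ then yields 

$$j_{\mathrm{bind}}(u) \approx p_r (1 - \tau_b u), \tag{S9}$$

and 

$$j_{\mathrm{out}}(u) \approx (1 - p_r)(1 - \tau_e u). \tag{S10}$$

We obtain 

$$p_r = \frac{a + \left(\gamma^2 (a-1) - 3a\right)\phi(\gamma)}{a + (a-1)\gamma^2 \phi(\gamma)}, \tag{S11}$$

where we introduced $a = r_2/r_g$, $\gamma = r_g\sqrt{\kappa/D_3}$ and the auxiliary function $\phi(\gamma) = (\gamma\coth(\gamma) - 1)/\gamma^2$ [S1]. 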
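As a numerical illustration, the association probability of Eq. (S11) can be evaluated directly from $a$ and $\gamma$. The following is a minimal sketch; the function names `phi` and `association_probability` and the sample values of $a$ and $\gamma$ are illustrative assumptions and are not taken from the paper.

```python
import math


def phi(gamma):
    """Auxiliary function phi(gamma) = (gamma*coth(gamma) - 1) / gamma**2."""
    if abs(gamma) < 1e-4:
        # Leading terms of the series expansion; avoids 0/0 near gamma = 0.
        return 1.0 / 3.0 - gamma**2 / 45.0
    return (gamma / math.tanh(gamma) - 1.0) / gamma**2


def association_probability(a, gamma):
    """p_r of Eq. (S11) for a = r_2/r_g and gamma = r_g*sqrt(kappa/D_3)."""
    f = phi(gamma)
    numerator = a + (gamma**2 * (a - 1.0) - 3.0 * a) * f
    denominator = a + (a - 1.0) * gamma**2 * f
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to show the trend with gamma.
    for gamma in (0.1, 1.0, 5.0, 50.0):
        print(gamma, association_probability(2.0, gamma))
```

In this sketch $p_r$ vanishes for $\gamma \to 0$ (no binding) and approaches one for large $\gamma$, as expected from the definition of $\kappa$.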
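The average time it takes for binding reads 

$$\tau_b = \frac{a}{2\kappa}\Big\{5a + \left(4(a-1)\gamma^2 - 15a\right)\phi(\gamma) + \left(12 - 15a + 2\gamma^2(1-a)^2\right)\gamma^2\phi^2(\gamma)\Big\} \times \left(a + (a-1)\gamma^2\phi(\gamma)\right)^{-1} \times \left(a + \left(\gamma^2(a-1) - 3a\right)\phi(\gamma)\right)^{-1}. \tag{S12}$$

This equation is true for arbitrary values of $a$. In the main text we explicitly state the case $a = 2$. 
However, in the results section we use $a = \sqrt{23/5} \approx 2.14$, as described in the last section of this SI. 
The average time the TF needs for leaving the blob is given by 

$$\tau_e = \frac{1}{2\kappa}\Big\{a\left(3 - \phi^{-1}(\gamma)\right) + \gamma^2\Big[(3a-2)\phi(\gamma) + \frac{(1-a)^2}{3}\big(a + 2 - (1-a)\gamma^2\phi(\gamma)\big)\Big]\Big\} \times \left(a + (a-1)\gamma^2\phi(\gamma)\right)^{-1}. \tag{S13}$$

1.2 Target finding probability 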

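To calculate the probability to find the target before dissociating, we consider the one-dimensional 
diffusion problem 

$$\frac{\partial c(z,t)}{\partial t} = D_1 \frac{\partial^2 c(z,t)}{\partial z^2} - k_{\mathrm{off}}\, c(z,t), \tag{S14}$$

subject to the initial condition $c(z, t=0) = 1/L$ and the boundary conditions $c(z=0,t) = 0$ and 
$\left.\partial c(z,t)/\partial z\right|_{z=L} = 0$. In Laplace space with respect to time we obtain the following solution: 

$$c(u,z) = \frac{1}{L(u + k_{\mathrm{off}})}\left[1 - \frac{\cosh\left((L-z)\sqrt{(u+k_{\mathrm{off}})/D_1}\right)}{\cosh\left(L\sqrt{(u+k_{\mathrm{off}})/D_1}\right)}\right]. \tag{S15}$$

A Taylor series of $j_{\mathrm{target}}(u) = D_1 \left.\partial c(u,z)/\partial z\right|_{z=0}$ in $u$ yields: 

$$j_{\mathrm{target}}(u) \approx \frac{\tanh(L/\ell)}{L/\ell} + \frac{u}{2k_{\mathrm{off}}}\left(\frac{1}{\cosh^2(L/\ell)} - \frac{\tanh(L/\ell)}{L/\ell}\right), \tag{S16}$$

where $\ell = \sqrt{D_1/k_{\mathrm{off}}}$. 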
Figure S1. Ratio of the mean search times obtained with individual conformations to the respective 
ensemble-averaged mean search time at $k_{\mathrm{ass}} = 10^5\,\mathrm{M^{-1}\,s^{-1}}$. 



This corresponds to a target finding probability of 

$$p_t = \frac{\tanh(L/\ell)}{L/\ell}. \tag{S17}$$

The average time it takes to find the target reads 

$$\tau_t = \frac{1}{2k_{\mathrm{off}}}\left(1 - \frac{L/\ell}{\sinh(L/\ell)\cosh(L/\ell)}\right). \tag{S18}$$
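The two results of this subsection, the target finding probability (S17) and the mean time to find the target (S18), depend only on $L$, $D_1$ and $k_{\mathrm{off}}$. The sketch below evaluates both; the function name `target_search_1d` and the numerical values in the usage part are hypothetical and serve only as an illustration. The identity $x/(\sinh x\cosh x) = 4x\,e^{-2x}/(1 - e^{-4x})$ is used so that large $L/\ell$ does not overflow.

```python
import math


def target_search_1d(L, D1, k_off):
    """Return (p_t, tau_t) from Eqs. (S17) and (S18).

    L     : length of the DNA stretch scanned by sliding (L > 0)
    D1    : one-dimensional diffusion constant along the DNA
    k_off : dissociation rate from the DNA
    """
    ell = math.sqrt(D1 / k_off)  # ell = sqrt(D_1 / k_off)
    x = L / ell
    p_t = math.tanh(x) / x
    # x / (sinh(x) cosh(x)), rewritten to stay finite for large x.
    ratio = 4.0 * x * math.exp(-2.0 * x) / (-math.expm1(-4.0 * x))
    tau_t = (1.0 - ratio) / (2.0 * k_off)
    return p_t, tau_t


if __name__ == "__main__":
    # Hypothetical parameter values in arbitrary consistent units.
    print(target_search_1d(L=1.0, D1=1.0, k_off=1.0))
    print(target_search_1d(L=100.0, D1=1.0, k_off=1.0))
```

For $L \gg \ell$ the sketch reproduces the limits $p_t \approx \ell/L$ and $\tau_t \approx 1/(2k_{\mathrm{off}})$ that follow from (S17) and (S18).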

2 Justification for the use of the ensemble average 

In Figure S1 we plot, for all eight individual conformations, the ratio of the mean search time to the 
mean search time of the corresponding ensemble average at $k_{\mathrm{ass}} = 10^5\,\mathrm{M^{-1}\,s^{-1}}$. 

Apparently all the individual curves only scatter about one percent around the value obtained with 
the ensemble average. Thus it appears appropriate always to use the latter in the main text. 



3 Derivation of $a = \sqrt{23/5}$ 

In principle the parameter $a$, which represents the ratio of the cutoff radius $r_2$ to the blob's radius of 
gyration $r_g$, is a free parameter which can be used to refine the model. However, in the limit $\kappa \to 0$, that 
is when no binding to DNA occurs or when there is no DNA present, the escape time $\tau_e$ from a blob 
should coincide with the free diffusion time $\tau_{3\mathrm{D}}$. Now using Eq. (S13) we obtain 

$$\lim_{\kappa \to 0} \tau_e = \frac{r_g^2}{30 D_3}\left(5a^2 - 3\right).$$

Equating this with $\tau_{3\mathrm{D}} = 2r_g^2/(3D_3)$ yields $a = \sqrt{23/5}$. Consequently, this value was chosen in the main text. 
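The limit used in this section can be checked numerically. The sketch below evaluates the escape time at a small value of $\gamma$ and compares it with $r_g^2(5a^2-3)/(30D_3)$ and with $2r_g^2/(3D_3)$. It assumes that Eq. (S13) reads $\tau_e = \frac{1}{2\kappa}\{a(3-\phi^{-1}) + \gamma^2[(3a-2)\phi + \frac{(1-a)^2}{3}(a+2-(1-a)\gamma^2\phi)]\}/(a+(a-1)\gamma^2\phi)$; the function names and the choice $r_g = D_3 = 1$ are illustrative assumptions.

```python
import math


def phi(gamma):
    """Auxiliary function phi(gamma) = (gamma*coth(gamma) - 1) / gamma**2."""
    return (gamma / math.tanh(gamma) - 1.0) / gamma**2


def escape_time(a, gamma, r_g=1.0, D3=1.0):
    """tau_e in the form assumed above for Eq. (S13); requires gamma > 0."""
    kappa = D3 * gamma**2 / r_g**2  # from gamma = r_g * sqrt(kappa / D_3)
    f = phi(gamma)
    bracket = a * (3.0 - 1.0 / f) + gamma**2 * (
        (3.0 * a - 2.0) * f
        + (1.0 - a) ** 2 / 3.0 * (a + 2.0 - (1.0 - a) * gamma**2 * f)
    )
    return bracket / (2.0 * kappa * (a + (a - 1.0) * gamma**2 * f))


if __name__ == "__main__":
    a = math.sqrt(23.0 / 5.0)
    gamma_small = 1e-2               # stands in for the limit kappa -> 0
    tau_e = escape_time(a, gamma_small)
    tau_limit = (5.0 * a**2 - 3.0) / 30.0  # r_g = D_3 = 1
    tau_3d = 2.0 / 3.0                     # 2 r_g^2 / (3 D_3) with r_g = D_3 = 1
    print(tau_e, tau_limit, tau_3d)
```

The small-$\gamma$ value should agree with both reference values up to corrections of order $\gamma^2$, which is the consistency condition that fixes $a$.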